Simultaneous model-based clustering and visualization in the Fisher discriminative subspace
Abstract
Clustering in high-dimensional spaces is a recurrent problem in many scientific domains, but it remains a difficult task in terms of both clustering accuracy and interpretability of the results. This paper presents a discriminative latent mixture (DLM) model which fits the data in a latent orthonormal discriminative subspace with an intrinsic dimension lower than the dimension of the original space. By constraining model parameters within and between groups, a family of 12 parsimonious DLM models is obtained, which can accommodate a variety of situations. An estimation algorithm, called the Fisher-EM algorithm, is also proposed for estimating both the mixture parameters and the discriminative subspace. Experiments on simulated and real datasets highlight the good performance of the proposed approach compared with existing clustering methods, while providing a useful representation of the clustered data. The method is also applied to the clustering of mass spectrometry data.
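The abstract does not spell out how the discriminative subspace is estimated, so the following is only a minimal sketch of one plausible building block: given soft cluster memberships (as an E-step would produce), it computes an orthonormal basis that maximizes a Fisher-type ratio of between-cluster scatter to total scatter. The function name `fisher_subspace`, the use of the total covariance in the denominator, the small ridge term, and the final QR orthonormalization are all assumptions made for illustration, not the paper's actual procedure.

```python
import numpy as np

def fisher_subspace(X, resp, d, ridge=1e-8):
    """Return a (p, d) orthonormal basis of a Fisher-type discriminative subspace.

    X    : (n, p) data matrix
    resp : (n, K) soft cluster memberships (rows sum to 1)
    d    : target subspace dimension (hypothetical choice, typically < K)
    """
    n, p = X.shape
    mean = X.mean(axis=0)
    Xc = X - mean
    S = Xc.T @ Xc / n                          # total covariance
    nk = resp.sum(axis=0)                      # soft cluster sizes
    means = (resp.T @ X) / nk[:, None]         # soft cluster means
    diff = means - mean
    S_B = (diff * nk[:, None]).T @ diff / n    # soft between-cluster scatter

    # Solve the generalized eigenproblem S_B u = lambda * S u by whitening
    # with the Cholesky factor of S (ridge added for numerical stability).
    L = np.linalg.cholesky(S + ridge * np.eye(p))
    L_inv = np.linalg.inv(L)
    M = L_inv @ S_B @ L_inv.T                  # symmetric, so eigh applies
    _, vecs = np.linalg.eigh(M)                # eigenvalues in ascending order
    V = L_inv.T @ vecs[:, ::-1][:, :d]         # top-d generalized eigenvectors

    # Orthonormalize so the basis has orthonormal columns.
    U, _ = np.linalg.qr(V)
    return U
```

In an EM-style loop, a function like this could be called between the E-step and the M-step, with the data then projected as `X @ U` for both mixture fitting and visualization. How the published algorithm handles this step may differ.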
Related articles
Theoretical and practical considerations on the convergence properties of the Fisher-EM algorithm
The Fisher-EM algorithm was recently proposed in [4] for the simultaneous visualization and clustering of high-dimensional data. It is based on a latent mixture model which fits the data into a latent discriminative subspace with a low intrinsic dimension. Although the Fisher-EM algorithm is based on the EM algorithm, it does not at first glance satisfy all conditions of the EM convergen...
Clustering in Fisher Discriminative Subspaces
Clustering in high-dimensional spaces is nowadays a recurrent problem in many scientific domains but remains a difficult problem. This is mainly due to the fact that high-dimensional data usually live in low-dimensional subspaces hidden in the original space. This paper presents a model-based clustering approach which models the data in a discriminative subspace with an intrinsic dimension lowe...
Discriminative K-means for Clustering
We present a theoretical study on the discriminative clustering framework, recently proposed for simultaneous subspace selection via linear discriminant analysis (LDA) and clustering. Empirical results have shown its favorable performance in comparison with several other popular clustering algorithms. However, the inherent relationship between subspace selection and clustering in this framework...
Discriminative Training of Subspace Gaussian Mixture Model for Pattern Classification
The Gaussian mixture model (GMM) has been widely used in pattern recognition problems for clustering and probability density estimation. For pattern classification, however, the GMM has to consider two issues: model structure in high-dimensional space and discriminative training for optimizing the decision boundary. In this paper, we propose a classification method using subspace GMM density mo...
Spherical Discriminant Analysis in Semi-supervised Speaker Clustering
Semi-supervised speaker clustering refers to the use of our prior knowledge of speakers in general to assist the unsupervised speaker clustering process. In the form of an independent training set, the prior knowledge helps us learn a speaker-discriminative feature transformation, a universal speaker prior model, and a discriminative speaker subspace, or equivalently a speaker-discriminative di...
Journal:
- Statistics and Computing
Volume 22
Published 2012